{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Ywktc8UkWbRv"
   },
   "source": [
    "# Ungraded Lab: Predicting Sunspots with Neural Networks\n",
    "\n",
    "At this point in the course, you should be able to explore different network architectures for forecasting. In the previous weeks, you've used DNNs, RNNs, and CNNs to build these different models. In the final practice lab for this course, you'll try one more configuration and that is a combination of all these types of networks: the data windows will pass through a convolution, followed by stacked LSTMs, followed by stacked dense layers. See if this improves results or you can just opt for simpler models."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "JR3s1iNXW8qv"
   },
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "sLl52leVp5wU"
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UTgnwTJiW-bF"
   },
   "source": [
    "## Utilities"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "iDrgVqCTao9j"
   },
   "outputs": [],
   "source": [
    "def plot_series(x, y, format=\"-\", start=0, end=None, \n",
    "                title=None, xlabel=None, ylabel=None, legend=None ):\n",
    "    \"\"\"\n",
    "    Visualizes time series data\n",
    "\n",
    "    Args:\n",
    "      x (array of int) - contains values for the x-axis\n",
    "      y (array of int or tuple of arrays) - contains the values for the y-axis\n",
    "      format (string) - line style when plotting the graph\n",
    "      start (int) - first time step to plot\n",
    "      end (int) - last time step to plot\n",
    "      title (string) - title of the plot\n",
    "      xlabel (string) - label for the x-axis\n",
    "      ylabel (string) - label for the y-axis\n",
    "      legend (list of strings) - legend for the plot\n",
    "    \"\"\"\n",
    "\n",
    "    # Setup dimensions of the graph figure\n",
    "    plt.figure(figsize=(10, 6))\n",
    "    \n",
    "    # Check if there are more than two series to plot\n",
    "    if type(y) is tuple:\n",
    "\n",
    "      # Loop over the y elements\n",
    "      for y_curr in y:\n",
    "\n",
    "        # Plot the x and current y values\n",
    "        plt.plot(x[start:end], y_curr[start:end], format)\n",
    "\n",
    "    else:\n",
    "      # Plot the x and y values\n",
    "      plt.plot(x[start:end], y[start:end], format)\n",
    "\n",
    "    # Label the x-axis\n",
    "    plt.xlabel(xlabel)\n",
    "\n",
    "    # Label the y-axis\n",
    "    plt.ylabel(ylabel)\n",
    "\n",
    "    # Set the legend\n",
    "    if legend:\n",
    "      plt.legend(legend)\n",
    "\n",
    "    # Set the title\n",
    "    plt.title(title)\n",
    "\n",
    "    # Overlay a grid on the graph\n",
    "    plt.grid(True)\n",
    "\n",
    "    # Draw the graph on screen\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AU9FU6-BXE56"
   },
   "source": [
    "## Download and Preview the Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "miE0cEqdu3eS"
   },
   "outputs": [],
   "source": [
    "# Download the Dataset\n",
    "!wget -nc https://storage.googleapis.com/tensorflow-1-public/course4/Sunspots.csv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "NcG9r1eClbTh"
   },
   "outputs": [],
   "source": [
    "# Initialize lists\n",
    "time_step = []\n",
    "sunspots = []\n",
    "\n",
    "# Open CSV file\n",
    "with open('./Sunspots.csv') as csvfile:\n",
    "  \n",
    "  # Initialize reader\n",
    "  reader = csv.reader(csvfile, delimiter=',')\n",
    "  \n",
    "  # Skip the first line\n",
    "  next(reader)\n",
    "  \n",
    "  # Append row and sunspot number to lists\n",
    "  for row in reader:\n",
    "    time_step.append(int(row[0]))\n",
    "    sunspots.append(float(row[2]))\n",
    "\n",
    "# Convert lists to numpy arrays\n",
    "time = np.array(time_step)\n",
    "series = np.array(sunspots)\n",
    "\n",
    "# Preview the data\n",
    "plot_series(time, series, xlabel='Month', ylabel='Monthly Mean Total Sunspot Number')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Tq34m86mXHk4"
   },
   "source": [
    "## Split the Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "L92YRw_IpCFG"
   },
   "outputs": [],
   "source": [
    "# Define the split time\n",
    "split_time = 3000\n",
    "\n",
    "# Get the train set \n",
    "time_train = time[:split_time]\n",
    "x_train = series[:split_time]\n",
    "\n",
    "# Get the validation set\n",
    "time_valid = time[split_time:]\n",
    "x_valid = series[split_time:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jlJDtNMDXkJF"
   },
   "source": [
    "## Prepare Features and Labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "lJwUUZscnG38"
   },
   "outputs": [],
   "source": [
    "def windowed_dataset(series, window_size, batch_size, shuffle_buffer):\n",
    "    \"\"\"Generates dataset windows\n",
    "\n",
    "    Args:\n",
    "      series (array of float) - contains the values of the time series\n",
    "      window_size (int) - the number of time steps to include in the feature\n",
    "      batch_size (int) - the batch size\n",
    "      shuffle_buffer(int) - buffer size to use for the shuffle method\n",
    "\n",
    "    Returns:\n",
    "      dataset (TF Dataset) - TF Dataset containing time windows\n",
    "    \"\"\"\n",
    "\n",
    "    # Add an axis for the feature dimension of RNN layers\n",
    "    series = tf.expand_dims(series, axis=-1)\n",
    "    \n",
    "    # Generate a TF Dataset from the series values\n",
    "    dataset = tf.data.Dataset.from_tensor_slices(series)\n",
    "    \n",
    "    # Window the data but only take those with the specified size\n",
    "    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)\n",
    "    \n",
    "    # Flatten the windows by putting its elements in a single batch\n",
    "    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))\n",
    "\n",
    "    # Create tuples with features and labels \n",
    "    dataset = dataset.map(lambda window: (window[:-1], window[-1]))\n",
    "\n",
    "    # Shuffle the windows\n",
    "    dataset = dataset.shuffle(shuffle_buffer)\n",
    "    \n",
    "    # Create batches of windows\n",
    "    dataset = dataset.batch(batch_size)\n",
    "    \n",
    "    # Optimize the dataset for training\n",
    "    dataset = dataset.cache().prefetch(1)\n",
    "    \n",
    "    return dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "lakVlpDry5iv"
   },
   "source": [
    "As mentioned in the lectures, if your results don't look good, you can try tweaking the parameters here and see if the model will learn better."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "1Z0qRGp1zl5b"
   },
   "outputs": [],
   "source": [
    "# Parameters\n",
    "window_size = 30\n",
    "batch_size = 32\n",
    "shuffle_buffer_size = 1000\n",
    "\n",
    "# Generate the dataset windows\n",
    "train_set = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ylQoAXTaXscV"
   },
   "source": [
    "## Build the Model\n",
    "\n",
    "You've seen these layers before and here is how it looks like when combined."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "y6kJd40-0Hj9"
   },
   "outputs": [],
   "source": [
    "# Build the Model\n",
    "model = tf.keras.models.Sequential([\n",
    "    tf.keras.Input(shape=(window_size,1)),\n",
    "    tf.keras.layers.Conv1D(filters=64, kernel_size=3,\n",
    "                      strides=1,\n",
    "                      activation=\"relu\",\n",
    "                      padding='causal'),\n",
    "    tf.keras.layers.LSTM(64, return_sequences=True),\n",
    "    tf.keras.layers.LSTM(64),\n",
    "    tf.keras.layers.Dense(30, activation=\"relu\"),\n",
    "    tf.keras.layers.Dense(10, activation=\"relu\"),\n",
    "    tf.keras.layers.Dense(1),\n",
    "    tf.keras.layers.Lambda(lambda x: x * 400)\n",
    "])\n",
    "\n",
    " # Print the model summary \n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EtSCmjQfXyKi"
   },
   "source": [
    "## Tune the Learning Rate\n",
    "\n",
    "As usual, you will want to pick an optimal learning rate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "CpVZbobXM117"
   },
   "outputs": [],
   "source": [
    "# Get initial weights\n",
    "init_weights = model.get_weights()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "AclfYY3Mn6Ph"
   },
   "outputs": [],
   "source": [
    "# Set the learning rate scheduler\n",
    "lr_schedule = tf.keras.callbacks.LearningRateScheduler(\n",
    "    lambda epoch: 1e-8 * 10**(epoch / 20))\n",
    "\n",
    "# Initialize the optimizer\n",
    "optimizer = tf.keras.optimizers.SGD(momentum=0.9)\n",
    "\n",
    "# Set the training parameters\n",
    "model.compile(loss=tf.keras.losses.Huber(), optimizer=optimizer)\n",
    "\n",
    "# Train the model\n",
    "history = model.fit(train_set, epochs=100, callbacks=[lr_schedule])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "vVcKmg7Q_7rD"
   },
   "outputs": [],
   "source": [
    "# Define the learning rate array\n",
    "lrs = 1e-8 * (10 ** (np.arange(100) / 20))\n",
    "\n",
    "# Set the figure size\n",
    "plt.figure(figsize=(10, 6))\n",
    "\n",
    "# Set the grid\n",
    "plt.grid(True)\n",
    "\n",
    "# Plot the loss in log scale\n",
    "plt.semilogx(lrs, history.history[\"loss\"])\n",
    "\n",
    "# Increase the tickmarks size\n",
    "plt.tick_params('both', length=10, width=1, which='both')\n",
    "\n",
    "# Set the plot boundaries\n",
    "plt.axis([1e-8, 1e-3, 0, 100])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "i-Pn-n60X3uV"
   },
   "source": [
    "## Train the Model\n",
    "\n",
    "Now you can proceed to reset and train the model. It is set for 100 epochs in the cell below but feel free to increase it if you want. Laurence got his results in the lectures after 500."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "QsksvkcXAAgq"
   },
   "outputs": [],
   "source": [
    "# Reset states generated by Keras\n",
    "tf.keras.backend.clear_session()\n",
    "\n",
    "# Reset the weights\n",
    "model.set_weights(init_weights)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "kBYFogmdFM-R"
   },
   "outputs": [],
   "source": [
    "# Set the learning rate\n",
    "learning_rate = 8e-7\n",
    "\n",
    "# Set the optimizer \n",
    "optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)\n",
    "\n",
    "# Set the training parameters\n",
    "model.compile(loss=tf.keras.losses.Huber(),\n",
    "              optimizer=optimizer,\n",
    "              metrics=[\"mae\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "LrcqLyLALmCY"
   },
   "outputs": [],
   "source": [
    "# Train the model\n",
    "history = model.fit(train_set,epochs=100)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "zAzIkJd0L6WD"
   },
   "source": [
    "You can visualize the training and see if the loss and MAE are still trending down."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "MD2kyYUVt3O0"
   },
   "outputs": [],
   "source": [
    "# Get mae and loss from history log\n",
    "mae=history.history['mae']\n",
    "loss=history.history['loss']\n",
    "\n",
    "# Get number of epochs\n",
    "epochs=range(len(loss)) \n",
    "\n",
    "# Plot mae and loss\n",
    "plot_series(\n",
    "    x=epochs, \n",
    "    y=(mae, loss), \n",
    "    title='MAE and Loss', \n",
    "    xlabel='epoch',\n",
    "    ylabel='Loss',\n",
    "    legend=['MAE', 'Loss']\n",
    "    )\n",
    "\n",
    "# Only plot the last 80% of the epochs\n",
    "zoom_split = int(epochs[-1] * 0.2)\n",
    "epochs_zoom = epochs[zoom_split:]\n",
    "mae_zoom = mae[zoom_split:]\n",
    "loss_zoom = loss[zoom_split:]\n",
    "\n",
    "# Plot zoomed mae and loss\n",
    "plot_series(\n",
    "    x=epochs_zoom, \n",
    "    y=(mae_zoom, loss_zoom), \n",
    "    title='MAE and Loss', \n",
    "    xlabel='epoch',\n",
    "    ylabel='Loss',\n",
    "    legend=['MAE', 'Loss']\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BHxos0eEX7b7"
   },
   "source": [
    "## Model Prediction\n",
    "\n",
    "As before, you can get the predictions for the validation set time range and compute the metrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "4XwGrf-A_wF0"
   },
   "outputs": [],
   "source": [
    "def model_forecast(model, series, window_size, batch_size):\n",
    "    \"\"\"Uses an input model to generate predictions on data windows\n",
    "\n",
    "    Args:\n",
    "      model (TF Keras Model) - model that accepts data windows\n",
    "      series (array of float) - contains the values of the time series\n",
    "      window_size (int) - the number of time steps to include in the window\n",
    "      batch_size (int) - the batch size\n",
    "\n",
    "    Returns:\n",
    "      forecast (numpy array) - array containing predictions\n",
    "    \"\"\"\n",
    "\n",
    "    # Add an axis for the feature dimension of RNN layers\n",
    "    series = tf.expand_dims(series, axis=-1)\n",
    "    \n",
    "    # Generate a TF Dataset from the series values\n",
    "    dataset = tf.data.Dataset.from_tensor_slices(series)\n",
    "\n",
    "    # Window the data but only take those with the specified size\n",
    "    dataset = dataset.window(window_size, shift=1, drop_remainder=True)\n",
    "\n",
    "    # Flatten the windows by putting its elements in a single batch\n",
    "    dataset = dataset.flat_map(lambda w: w.batch(window_size))\n",
    "    \n",
    "    # Create batches of windows\n",
    "    dataset = dataset.batch(batch_size).prefetch(1)\n",
    "    \n",
    "    # Get predictions on the entire dataset\n",
    "    forecast = model.predict(dataset, verbose=0)\n",
    "    \n",
    "    return forecast"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "h5f7kKbFL6zv"
   },
   "outputs": [],
   "source": [
    "# Reduce the original series\n",
    "forecast_series = series[split_time-window_size:-1]\n",
    "\n",
    "# Use helper function to generate predictions\n",
    "forecast = model_forecast(model, forecast_series, window_size, batch_size)\n",
    "\n",
    "# Drop single dimensional axis\n",
    "results = forecast.squeeze()\n",
    "\n",
    "# Plot the results\n",
    "plot_series(time_valid, (x_valid, results))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "6jZrRYjQMNHr"
   },
   "outputs": [],
   "source": [
    "# Compute the MAE\n",
    "print(tf.keras.metrics.mae(x_valid, results).numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OXwSQIdQ0rfT"
   },
   "source": [
    "## Wrap Up\n",
    "\n",
    "This concludes the final practice lab for this course! You implemented a deep and complex architecture composed of CNNs, RNNs, and DNNs. You'll be using the skills you developed throughout this course to complete the final assignment. Keep it up!"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "C4_W4_Lab_3_Sunspots_CNN_RNN_DNN.ipynb",
   "private_outputs": true,
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0rc1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
